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QUASISEPARABLE HESSENBERG REDUCTION OF REAL DIAGONAL 
PLUS LOW RANK MATRICES AND APPLICATIONS* 
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Abstract. We present a novel algorithm to perform the Hessenberg reduction of an n x n matrix A of 
the form A = D -j- UV* where D is diagonal with real entries and U and V are n x k matrices with k < n. 
The algorithm has a cost of 0(rP‘k) arithmetic operations and is based on the quasiseparable matrix technology. 
Applications are shown to solving polynomial eigenvalue problems and some numerical experiments are reported 
in order to analyze the stability of the approach. 
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1. Introduction. Reducing an n x n matrix to upper Hessenberg form by a unitary sim¬ 
ilarity is a fundamental step at the basis of most numerical methods for computing matrix 
eigenvalues. For a general matrix, this reduction has a cost of 0{v?) arithmetic operations (ops), 
while for matrices having additional structures this cost can be lowered. This happens, for in¬ 
stance, for the class of quasiseparable matrices. We say that a matrix is (fcl, fc2)-quasiseparable 
if its submatrices contained in the strictly upper triangular part have rank at most k 2 and those 
contained in the strictly lower triangular part have rank at most fci. For simplicity, if fci = fc 2 = fc 
we say that A is fc-quasiseparable. Quasiseparable matrices have recently received much atten¬ 
tion; for properties of this matrix class we refer the reader to the recent books 0, 0, m, 

m- 

In the papers [4] and [8], algorithms are provided to reduce a fc-quasiseparable matrix A 
to upper Hessenberg form H via a sequence of unitary transformations. If A satisfies some 
additional hypothesis then the Hessenberg matrix obtained this way is still quasiseparable, and 
the cost of this reduction is 0{n?k°‘) ops, where a is strictly greater than 1, in particular a = 3 
for the algorithm of [2]. The advantage of this property is that one can apply the shifted QR 
iteration to the matrix H at the cost of 0{nk‘^) ops per step, instead of 0{n^), by exploiting 
both the Hessenberg and the quasiseparable structure of iJ, see for instance m, HD- This way, 
the computation of the eigenvalues of A has a lower cost with respect to the case of a general 
matrix. More specifically, assuming that the number of QR iterations is 0{n), the overall cost for 
computing all the eigenvalues turns to 0{n^k'^ + Ti?k°‘). Clearly, if a > 3, the advantage of this 
quasiseparable reduction to Hessenberg form can be appreciated only if k <^n. In particular, if 
the value of k is of the order of na these algorithms have cost O(n^) as for a general unstructured 
matrix. 

In this paper we consider the case of a fc-quasiseparable matrix which can be represented as 

A = D + UV*, U,VgC”^*^, (1.1) 

where D is a diagonal matrix with real entries and U,V G y* jg Hermitian 

transpose of V. Matrices of this kind are encountered in the linearization of matrix polynomials 
obtained by the generalization of the Smith companion form 0, 0- They have been also used 
in the package MPSolve v. 3.1.4 for the solution of polynomial and secular equations up to any 
desired precision [3]. 

We prove that if iJ = QAQ* is the Hessenberg form of A, with Q unitary, then H is 
(1, 2k— l)-quasiseparable, moreover, we provide an algorithm for computing H with 0{n'^k) ops. 
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This algorithm substantially improves the algorithms of [1] and whatever is the value of k. 
Moreover, for an unstructured matrix where k = n, the cost of our algorithm amounts to O(n^), 
that is the same asymptotic cost of the Hessenberg reduction for a general matrix. 

An immediate consequence of this algorithm is that the cost for computing all the eigenvalues 
of A by means of the shifted QR iteration applied to H turns to 0(v?k) with an acceleration by 
a factor of with respect to the algorithms of |1] and [S]. 

Another application of this result concerns the solution of polynomial eigenvalue problems 
and is the main motivation of this work. Consider the matrix polynomial P{x) = 

Pi S where for simplicity we assume = I. The polynomial eigenvalue problem consists 

in computing the solution of the equation det P{x) = 0 and, if needed, the nonzero vectors v 
such that P{x)v = 0. 

The usual strategy adopted in this case is to employ a linearization L{x) = xl — A oi 
the matrix polynomial, such that the linear eigenvalue problem L{x)w = 0 is equivalent to the 
original one P{x)v = 0. The matrix A, also called companion matrix of P{x), is of size mn x mn. 
Many companion matrices do have a fc-quasiseparable structure where k = m and in the case of 
the Smith companion [3], generalized to matrix polynomials in [2], the quasiseparable structure 
takes the desired form A = D + UV*. In this case, the cost of our algorithm to reduce A into 
quasiseparable Hessenberg form turns to 0{nm^) whereas the algorithms of [4] and [8] of cost 
0(nmA'^^), where a > 1, would be unpractical. 

This paper is divided in 6 sections. Besides the introduction, in Section we introduce 
some preliminary results, concerning quasiseparable matrices, which are needed to design the 
algorithm for the Hessenberg reduction. In Section we recall the general algorithm for reducing 
a matrix into Hessenberg form by means of Givens rotations and prove the main result, expressed 
in Theorem |3.2[ on the conservation of the quasiseparable structure at all the steps of the 
algorithm. In Section]^ we recall and elaborate the definition of Givens vector representation of 
a symmetric fc-quasiseparable matrix. Then we rephrase Theorem |3. 2 1 in terms of Givens vector 
representations. Section deals with algorithmic issues: the fast algorithm for the Hessenberg 
reduction is presented in detail. In Section]^ we present some numerical experiments and show 
that the GPU time needed in our tests confirms the complexity bound 0{'n?k). Finally the last 
Section briefly shows an application of these results to numerically computing the eigenvalues 
of a matrix polynomial. 


2. Preliminary tools. Throughout the paper the matrix A has the form (1.11 and P[ = 
QAQ* is in upper Hessenberg form, where Q is unitary, i.e., Q*Q = I. Here we recall the 
notations and definitions used in this paper, which mainly comply with the ones used in na, 
together with the main results concerning quasiseparable matrices. 

Definition 1. A complex matrix A is lower-quasiseparable (resp. upper-quasiseparable^ 
of rank k if every submatrix contained in the strictly lower (resp. upper) triangular part of A 
has rank at most k. If A is ki lower quasiseparable and upper quasiseparable we say that A is 
{ki, ku)-quasiseparable. If k = ki = k^ we say that A is k-quasiseparable. Moreover, we denote 
QSTL^ the set ofnxn Hermitian k-quasiseparable matrices with entries in C. We will sometimes 
omit the superscript n and simply write QSHk when the dimension is clear from the context. In 
general, we say that A is quasiseparable if it is {ki, ku)-quasiseparable for some nontrivial ki,ku. 

The definition of quasiseparable matrix can be easily expressed by using the MATLAB 
notation. In fact, A is lower-quasiseparable of rank k if and only if 


rank(A[i -\-1 : n,l : i]) ^ k for i = 1,..., n — 1, 

where A[ii : i 2 ,ji : J 2 ] denotes the submatrix of A formed by the entries aij for i = ii,... ,i 2 , 
j jl-} ■ • ■ J 2 • 
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Given a vector v = (vi) G C^, denote by G = G{vi,V 2 ) a 2 x 2 Givens rotation 


G = 


c 

—s 



c G M, |sp + = 1, 


such that Gv = aci, where |a| = More generally, denote Gi = /i_i 0 G0 

the nxn matrix which applies the Givens rotation G = [] to the components i and 1 01 of 
a vector, where Im denotes the identity matrix of size m and A(B B denotes the block diagonal 
matrix [ ^ ] • With abuse of terminology, we will call Gi Givens rotations as well. 

The following well-known result [13] will be used in our analysis: 

Lemma 2.1. Let Q be a unitary Hessenberg matrix. Then 

• Q is a {l,l)-yuasiseparable matrix. 

• Q can be factorized as a product of n — 1 Givens rotations Q = G\ ... G„_i. 

In the following we use the operators tril(-, •) and triu(-, •), coherently with the corresponding 
MATLAB functions, such that L = tril(A, fc), U = triu(A, fc) where A = (aij), L = (ii.j), 
U = (uij) and 


I Oi j iii^ j - k I Oij if i ^ j - fc 

rij = \ 3 = { 

I 0 otherwise ’ I 0 otherwise 

2.1. A useful operator. Another useful tool is the operator t : C”^" —>■ C”^" defined by 

t{A) = tril(A, —1) 0 triu(A*, 1). (2.1) 

Observe that if A has rank k then t{A) is a Hermitian fc-quasiseparable matrix with zero diagonal 
entries. In particular, for u,v € C", the matrix t{uv*) is in QSTCf and its entries are independent 
of Ml and of m„. More generally, for U,V G the matrix t{UV*) is in QSTCf and its entries 

are independent of the first row [/[I,:] of U and of the last row V[n, :] of V. Observe also that 
t{A) is independent of the upper triangular part of A. 

The following properties can be verified by a direct inspection 

t{A + B)= t{A) 0 t{B), for any A, B, G C”^”, 

t{aA) = at{A), for any a G K, (2.2) 

t{DAD*) = Dt{A)D*, for any D diagonal matrix. 


moreover. 


( 

Ai,i Ai,2 


t(Aiq) 

4 * 

^ 2,1 

1 

A2,1 A2,2 

)- 

^ 2,1 

1 (^ 2 , 2 )_ 


where Ai^i and ^ 2,2 are square matrices. We also have 


(2.3) 


t{A) = A — diag(aiq,...,««,«), for any A such that A = A*. (2.4) 

We analyze some properties of the residual matrix R = t{SAS*) — St{A)S*, for S being a 
unitary upper Hessenberg matrix, which will be used to prove the main result in the next section. 
We start with a couple of technical lemmas. 

Lemma 2.2. Let Z G and set S = Z (B In-k where n > k. Then for any A G C"^" it 

holds that 


t{SAS*) - StiA)S* = W® 0„_fc, 


3 









for some W G where On-k is the null matrix of size n — k. Similarly, for S = In-k (B Z it 

holds that t{SAS*) — St{A)S* = On-k © W, for some W € The same properties hold if 

In-k is replaced by a diagonal matrix Dn-k- 


Proof. Concerning the first part, partition A as A = 
SAS* = 


^1,1 Ai,2 
A.2.1 A2,2 


ZAi^iZ* ZA\^2 
A 2 ,iZ* A 2.2 


. In view of (2.3) we have 


t{SAS*) = 


t{ZAi^iZ*) Z^ 2 .i' 
A2,iZ* ^(^ 2 , 2 ) 


where Ai^i G so that 


(2.5) 


On the other hand, 

St{A)S* = S 


t(Ai,i) 

A* 

^2,1 

0* 

'Zt{Ai,i)Z* 

ZA*,i' 

^ 2,1 

t(A2,2)_ 

0 - 

A2.1Z* 

t(^2,2)_ 


( 2 . 6 ) 


So that, from (2.5) and (2.6) we get t{SAS*) — St{A)S* = W (B 0„_fc, with W = t(ZAi^iZ*) — 
Zt(Aij)Z*. The second part can be proved similarly. Finally, if I„-k is replaced by the diag¬ 
onal matrix Dn-k the same properties hold since for a diagonal matrix D one has t{DAD*) — 
Dt{A)D* = 0 in view of \2.2\. □ 


Lemma 2.3. Let S = (Z ® In- 2 ) 
matrix A partitioned as A = 


1®S) where Z gC^^^, S Then for 

G C"''", where A G it holds that 


any 


t{SAS^) - StiA)S* = IF © On-2 + iZ(B In-2){0 © {t{SAS*) - St{A)S*)){Z* © /„_2). 
for some W S 

Proof. Set i? = (1 © S')A(1 © S*) then by Lemma 


2.2 


t{SAS*) = tiiZ © In-2)B{Z* © In- 2 ) = IF © On-2 + [Z (B In-2)t{B){Z* © In- 2 ). 
On the other hand 

St{A)S* = {Z(B /„-2)(l © §)t(^)(l © S*){Z* © In- 2 ). 

Thus 

tiSAS*) - St{A)S* = IF © 0„_2 + {Z(B In-2)E{Z* © In- 2 ), 


where E = t{B) — (1 ©S')f(A)(l © S'). Now, since B = (1 © S')A(1 © S'*), in view of (2.3) we have 

, t{B) = 


B = 


01,1 u*S* 
Sv SAS* 


Sv t{SAS*) 


A similar analysis shows that 

(l©S)t(A)(l©S*) = 


Sv StiA)S* 


Thus we get 

t{SAS*) - StiA)S* = IF © 0„_2 + {Z(B /„-2)(0 © {t{SAS*) - St{A)S*)){Z* © In- 2 ). 
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□ 

A consequence of the above results is expressed by the following 

Theorem 2.4. Let A G and set Q = Gh---Gk for l<h<k<n — 1, where 

the parameters Si,Ci defining Gi are such that Si 0, i = h,...,k. Then := t{QAQ*) — 
Qt{A)Q* = diag((i) + t{ab*) G QSTLf for vectors a,b,d G C", where b is independent of A. More 
precisely, bh = Sh • • ■ Sk, bi = Ci_iSi ■ ■ ■ Sk, for i = h + 1, ■■■, k, bi = at = di = 0 for i < h and 
i > k + 1. In particular, if h > 1 then i?„ei = 0. 

Proof. Clearly the matrix Q has the form Ih-i(BZk-h+ 2 (BIn-k-i where Zk-h +2 is a unitary 


Hessenberg matrix of size k — h + 2. In view of Lemma 2.2 we can write i?„ = 0/j_i 0 Rk-h +2 ® 
Qn-k-i and this immediately proves the last statement of the Theorem. Moreover, it follows 
that Oi = bi = di = 0 for i = 1 ,... ,h — 1 and for i = k + 2, ... ,n so that it is sufficient to 
prove the claim for Rk-h+ 2 - Equivalently, we may assume that h = 1 and k = n — 1 so that 
Si 0 for i = 1,... ,n — 1. We prove that i?„ G QSHf by induction on n. For n = 2 it holds 
that i ?2 = [a o]' This way one can choose 02 = a/si and bi = si. For the inductive step, 
let n > 1 and observe that Q can be factorized as Q = {Z (B In- 2 ){^ 0 S) for Z = [As c]i and 


S G 
yields 


where for notational simplicity we set s = si, c = ci. Applying Lemma 


Rn=W® On-2 0 0 /„- 2)(0 0 Rn-l){Z* 0 In- 2 ) 


2.3 


for 


W = t{Z 


<^1,1 0.1,2] Oi,2]\7=(= 

02,1 02,2] ^ ) •^^VLo 2,1 02,2jJ-^ 


— c(sa2,i+-so2,i) 1+sa 1,2 ~ ca2,2 ~ <502,1) 

— s(cai,i+sai, 2 —ca2,2 — 502 , 1 ) c(sa2,i+(.sa2,i)) 


where i?„_i = t{SAS*) — St{A)S* and A is the trailing principal submatrix of A of size n — 1. 
A direct inspection shows that 


Rn — IT 0 0„_2 


\sfelRn-iei 

sefRn-iD 

^DRn-lCi 

DRn-lD 


D — c(B In- 2 - 


(2.7) 


From the inductive assumption we may write that Rn-i = diag(d) 0 t{db*) for a,b,d € C” 
where 5i = S 2 • • • s„_i ^ 0, bi = CiSi+i - - - Sn-i- So that (2.7| turns into 


R 


n 


■Wi,! Wi,2 
^2,1 'U?2,2 


0 


|spdi 

* 

* • • • ... * 

dies 

c^di 

* ... ... * 

a2bis 

a2bic 

d2 ■ ■ ■ ■ : 



a3b2 da ’ ■ ■ 



; ■. '. * 

a„-ibis 

dn-lblC 

Q-n — 1^2 • • • <^n —l^n —2 d-n—l 


where the upper triangular part, denoted with *, is determined by symmetry. Thus, it follows 
that Rn = diag(d) 0t(a, b) where di = |sipdi 0Wi,i, ^2 = c^di 0 1^2,2j di = di-i, for z = 3 ,..., n; 
moreover 

02 = ^(csdi 0 W2 1 ), 
bi 

ai=di-i, for z = 3 ,..., n, 

bi = si&i, 62 = Ci&i, 

bi = bi-i, for z = 3 ,..., rz — 1 , 
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where the condition Si ^ 0 implies that bi ^ 0. This completes the proof. □ 

The property that i? = + t{ab*), where b is independent of A enables us to prove the 

following 

Corollary 1. In the hypotheses of Theorem \2.4\ we have 

R = t{QAQ*) - Q{D + t{A))Q* G QSRf 


for any diagonal matrix D with real entries. Moreover, R = diag(d) + t{ab*) for some vectors 
a, 6 G C" and d G K". 

Proof. Without loss of generality we may assume that Si ^ 0 for i = 1,..., n — 1. In 
view of Theorem |2.4[ t{QAQ*) — Qt{A)Q* = diag(d) + t{ab*), where b is independent of A. 
On the other hand equation (2.4) implies that QDQ* = diag(d') + t{QDQ*), moreover, since 


t{D) = 0, Theorem 2.4 implies that t{QDQ*) = t{a'b*) for some vector o'. Thus we get QDQ* = 


diag(d') + t(a'6*). Therefore R = diag(d)+t(a6*) —diag(d') —t(a'6*) = diag(d + d')+t((a —a')6*), 
that is, R G QST-Cf. □ 

A further analysis enables us to provide an explicit representation of the ith row of the 
matrix R. More precisely we have the following result: 

Theorem 2.5. Under the assumptions of Theorem \2.4\ the ith row of the matrix R = 
t{QAQ*) — Q{D + t{A))Q* has the representation 


ef i? = [0,..., 0, d„ w*]G*_,GU ■■■Gl 


( 2 . 8 ) 


where, Wi, di G C, Wi G C”“* and di = Vi^i. 

Proof. Let us write Q = Q 1 Q 2 where Qi = Gi • • • Gi- 2 , Q 2 = Gi-i ■ ■ ■ G„_i, so that (2.8) 
can be rewritten as ef RQi = [0,..., 0, Vi, di, w*]. This way, it is enough to show that the ith 
row of RQi has the first i — 2 entries equal to zero. In view of Lemma |2.2| we have 


R' := t{Q2AQ*) - Q2{D + t{A))Q* = 0,_2 ®Rg QSR^ 


Whence 


Q2{D + t(^A))Q2 — t{Q2AQ2) — Oi—2 0 R. 

Moreover, by dehnition of R we have 

RQi = t{QiQ2AQ2Ql)Qi — QiQ2{D + t{A))Q2. 


(2.9) 


Setting B — Q 2 AQ 2 and combining the above equation with (2.9) yields 
RQi = t{QiBQl)Qi - Qit{B) + Qi(0,_2 © R). 

Now, since Qi{0i-2 0 R) has the hrst i — 2 columns equal to zero, it is sufficient to prove that the 
ith row of t{QiBQl)Qi — Qit{B) has the hrst i — 2 component zero. To this regard, observe that 


Qi = Qi® In-i+i, where Qi G 1 )^ so that partitioning B as 

C(*-i)x(*-i)^ by applying again Lemma 


2.2 


Bi,i 

82,1 B' 


Si,2 
2,2 


, where Bi iG 

_ _ _we hnd that t{QiBQl) - Qit{B)Ql = Wi-i®0n-i+i 

for some Wi-i G This implies that t{QiBQ\)Qi — Qit{B) has the last (n — i + 1) 

rows equal to zero. This completes the proof. □ 

Observe that the representation of R that we have found is exactly the Givens-Vector rep¬ 
resentation for quasiseparable matrices that is presented in |13j . A more general analysis of this 
representation is given in Section]^ 
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3. Reduction to Hessenberg form. It is a simple matter to show that the Hessenberg 
form H = QAQ* maintains a quasiseparable structure. In fact, since D is real, we find that H = 
QDQ* + QUV*Q* is the sum of a Hermitian matrix S = QDQ* and of a matrix T = QUV*Q* 
of rank k. Since H is Hessenberg, the submatrices contained in the lower triangular part of H 
have rank at most 1 so that the submatrices in the lower triangular part oi S = H — T have rank 
at most k 1. On the other hand, since S is Hermitian, then also the submatrices contained in 
its upper triangular part have rank at most fc + 1, thus the submatrices in the upper triangular 
part oi H = S + T have rank at most 2fc + 1 being the sum of the ranks of the corresponding 
submatrices of S and T respectively. This way we find that H is (l,2fc + l)-quasiseparable. 
Actually, we will see later in Section [3.1| that the submatrices contained in the upper triangular 
part of H have rank 2fc — 1. 

The nontrivial problem is to exploit this structure property and to provide a way to compute 
the matrix H with a cost lower than 0(v?) ops needed for general matrices. In order to arrive at 
this goal we have to recall the customary procedure for reducing a matrix A into Hessenberg form 
which is based on Givens rotations. The algorithm can be easily described by the pseudo-code 
reported in Algorithmwhere the function givens(wi, ^ 2 ) provides the matrix G(vi,t> 2 ). 

Algorithm 1 Reduction to Hessenberg form by means of Givens rotations 
1 : for j = 1 ,..., n — 2 do 
2 : for i = n,..., j -I- 2 do 

3 : G ^ givens(A[i - l,j], j]) 

4: A\i — 1 : i,:] <— G • A\i — 1 : i,:] 

5: A[:, i — 1 : i] ^ A[:, i — 1 ■. i]- G* 

6: end for 

7: end for 


Denote by Gi_i j the unitary n x n matrix which performs the Givens rotation G = 
givens(A(i — l,j),A{i,j)) in the rows i — 1 and i at step j of Algorithm and set Qj = 
Gj+ijGj+ 2 ,j ■ ■ • Gn-i,j- Then Algorithm [l] generates a sequence Aj of matrices such that 


Aq — A, A„-2 — H 

A, =Q,A,_iQ*, j = l,...,n-2 


where the matrix Aj has the form 


A, =Qj...Qi(D + UV*)Q; ...Q* = D + UV* = 




JA 


a 


JA 


(/) 


X 


, (3.1) 


and the symbols x and * denote any numbers. 

For notational simplicity, we denote by Aj the iji — j) x {n — j) trailing principal submatrix 
of Aj, that is, the submatrix represented by * in (3.1). Observe that Aj is the part of Aj that 


has not yet been reduced to Hessenberg form. Finally, by following the notation of Section we 
write Gi or Gij to denote a unitary matrix which applies a Givens transformation in the rows i 
and i -I- 1. 
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The following lemma is useful to get rid of non-generic cases in the process of Hessenberg 
reduction. 

Lemma 3.1. Let v € C”, v ^ 0 and consider Givens rotations G'i,...,G„_i constructed 
in such a way that (Gi... G„_i)w = 0, ..., 0)*, where S C®, for i = 1, ..., n — 1. 

If there exists h such that Gh = I then one can choose Gi = I for every i ^ h, that is, 

Proof. Since Gi ■ ■ ■ G„_i is a unitary matrix which acts in the last n — i + 1 components of 
V, the 2-norm of v[i : n] coincides with the 2-norm of {Gi ■ ■ ■ Gn-iv)[i : n], that is, On 

the other hand, if G?i = J then q, ..., 0) = (rcO+i)*, 0,..., 0) so that = 0. This 

implies that ||u[/i -1-1 : n]|| =0, whence = 0 for z = h -|- 1,..., n. This way, one can choose 
Gi = I ioT i = h + 1,... ,n — 1. □ 

• G„_i is in upper Hessen- 


Observe that in view of Lemma 2.1 the unitary matrix Q = G 2 


berg form. Moreover, the rotations Gi are such that QAci = aei + / 3 e 2 . In view of Lemma |TT] 
we can assume that if Gh = I then Gi = I for every i^ h, that is, Q = G 2 ... Gh-i- 

The quasiseparable structure of the matrices Ai is a consequence of our main result which 
is reported in the following 

Theorem 3.2. Let U,V,W € S = diag((i) +t(ab*) € QSLCl and define 


A = UV* + t{UW*) + S. 


Let Gi, i = 2,... ,n — 1 be Givens rotations acting on the rows i and z -|- 1 such that 
QAei = oi^iCi + / 3 e 2 , where Q = G 2 ■ ■ ■ G„_i. 

Then the matrix A obtained by removing the first row and the first column of QAQ* can he 
written again as 

A = UV* +t{UW*) + S 

where U,W G g = diag{d + t{db*)) G for some vectors d,d,b G C®®”^. 

Moreover, U and V are obtained by removing the first row of QU and QV, respectively. 

Proof. According to Lemma |3.1[ we may assume that in the first step of the process of 
reduction in Hessenberg form, the parameters Si satisfy the condition 7 ^ 0 for z = 2,... ,h, 
while Si = 0, for z = h -I- 1,..., n — 1, for some h < n — 1. In view of Lemmawithout loss of 
generality we may assume that h = n — 1. We have 


QAQ* = {QUYQVy + F, F = Q{t{UW*) + S)Q*. 

In view of Corollarywe have F = t{Q{UW* + ab*)Q*) — R ior R G QSHi. Thus 

QAQ* = {QU){QV)* + t{QU{QW)*) + t{Qa{Qh)*) - R. (3.2) 


Recall from Theorem |2.4| that i?ei = 0 and that Q has been chosen so that QAQ*ei = aei -l-,de 2 . 
This fact, together with (3.2), implies that the vector u = t{Qa{Qb)*)ei is such that m[ 3 : n] is 
in the span of the columns of {QU)[3 : n, :]. In view of (2.3) we may write t{Qa{Qb)*)[2 : n, 2 : 
n] = t{uz*) for u = u[2 : n], and for a suitable z G Applying (2.2) yields the following 

representation for the trailing principal submatrix A of QAQ* of size rz — 1 


A = UV* -f t{UW* +ur)-R 


where U, V and W are obtained by removing the first row of U, V and W, respectively, while 
R = R[2 : n, 2 : n]. Since u[2 : n] is in the span of U[2 : n,:], and since the hrst row of U as well 






as the first entry of u do not play any role in the value of t{UW* + ), we may set ui equal 

to an ai^ropriate value in such a way that u is in the span of th^olumns of U. This way, the 
matrix UW* + mz* has rank at most k and can be written as UW* for a suitable W G 
Thus we have 


A = UV* + t{UW*) + S 


for S = —R, that concludes the proof. □ 


Observe that the matrix A defined in (1.1) satisfies the assumptions of the above theorem 


with W = 0 and S = D real diagonal. This way, Theorem |3.2| shows that the trailing principal 
submatrix Aj of the sequence generated by the Hessenberg reduction, maintains the structure 
t{UjW*) + Sj where Sj = diag{dj) + t{ajb*). 

This fact is fundamental to design fast algorithms for the Hessenberg reduction of a matrix 


A, = u,v; 


of the form (1.1). In order to realize this goal, we need to choose a reliable explicit representation 


for t(UjW*) and Sj. This is the topic of the next section. 


Note also that the representation 


= 


UjV* + t{UjW*) + Sj implies that Aj is still a 


quasiseparable matrix of low quasiseparability rank. In fact, t{UjWj) + UjV* is (at most) 
(fc, 2fc)-quasiseparable, where k is the number of columns of Uj. The sum with Sj provides a 
{k + I, 2fc + 1)-quasiseparable matrix. We will prove in the next Theorem 3.3 that after the first 


step it is possible to replace Uj, Vj and Wj with n x (fc — I) matrices, thus showing that Aj is a 
(fc — I, 2fc — l)-quasiseparable matrix at all the intermediate steps of the Hessenberg reduction. 

3.1. Analysis of the first step. Recall that the matrix A = Aq is of the kind A = D+UV*. 
This is a particular case of the form A = UV* +t{UW*)+S above where W = 0 and S is diagonal. 
Because of this additional structure, the first Hessenberg redu^on step is somewhat special and 
we can get a sharper bound for the quasiseparability rank of Ai. This is shown in the next 


Theorem 3.3. Assume that the matrix A of (1.1) does not have the first column already in 


Hessenberg form, ie., A[3 : n, 1] 7 ^ 0. Then the rank of the matrix obtained by removing the first 
two rows of QiU is less than k — 1. Moreover, the (n — 1) x (n — 1) trailing principal submatrix 
Ai o/Ai = QiAQl has lower quasiseparable rank k. 

Proof. Observe that Ai can be written as the sum of a Hermitian matrix and of a matrix of 
rank at most k, namely, Ai = QiDQl + U\Vf, where Hi, Vi and Qi are the matrices obtained 
by removing the first row of QiU, QiV, and Qi, respectively. Define x = V*ei so that x yf 0 
and Aei = diCi + Ux. Since QiAci = aiCi -I- / 3 ie 2 , we find that QiUx = QiAci — diQiei = 
(«! — di)ei + fiie 2 , which implies Uix = fiiCi. Thus, C/i[2 : n — 1,:] has rank at most k—\. This 
matrix coincides with the matrix obtained by removing the first two rows of QiU so that the 
first part of the theorem is proven. Since Hi [2 : n — 1,:] has rank at most k — 1, then also the 
matrix Hi [2 : n — 1, has rank at most k — 1. We can conclude that every submatrix in the 
strictly lower triangular part of Ai, given by the sum of a submatrix of Hi [2 : n — 1, which 
has rank at most fc — 1 , and a submatrix in the lower triangular part of QiDQ’^ which has rank 
at most 1 in view of Corollary can have rank at most k. This completes the proof. □ 

It is important to note that we are not tracking the structure of the whole matrix but only 
the structure of the trailing part that we still need to reduce. This is not a drawback since 
the trailing part is the only information needed to continue the algorithm. Moreover, the entire 
Hessenberg matrix can be recovered at the end of the process by just knowing the diagonal and 
subdiagonal elements that are computed at each step together with the matrices H „_2 and ldi- 2 - 

4. Representing quasiseparable matrices. Finding good and efficient representations of 
quasiseparable matrices is a problem that has been studied in recent years. Some representations 
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have been introduced and analyzed for the rank 1 case. Some of them have been extended to (or 
were originally designed for) the higher rank case. 

In this section we provide a representation of quasiseparable matrices which, combined with 
the results of the previous section, enables us to design an algorithm for the Hessenberg reduction 
of the matrix A with cost 0{n^k). 


4.1. Givens Vector representations. A useful family of representations for quasisepara¬ 
ble matrices is described in [13] (for the 1-quasiseparable case) and is extended to a more general 
version in m- We use this kind of representation and adjust it to our framework. For the sake 
of clarity, we provide the details of the new notation in order to make this section self-contained. 

Definition 2. A tuple Q = on some ordered index set (/, ^) is said a sequence 

of Givens rotations. We also define JliG/ product in ascending order while niGrev(i) 

denotes the product in descending order. The following operations on Q are introduced: 

• Sv ■= nzGrev(I) G^V, for V € C”; 

• G*V := Hig/ for v S C”; 

• for J Q I, with the induced order, we call G[J] '■= the slice of G on the indices 

J; 

• for Givens sequences G = G' = {G'fj^^j , we define the product GG' to he the 

sequence 


GG' 


{Ei) 


iG/Uj ’ 


E, = 


G, if tel 
G' ift e J 


where U is the disjoint union operator and where the order on I U J is induced by the 
ones on I and J and by the agreement that Gi < G) for every i e I,j e J. 

The above definitions on the product between a sequence and a vector trivially extend to products 
between sequences and matrices. For instance, GA := niGrev(i) 

We are interested in the cases where the index sets I are special, in particular we consider 
the case of univariate sets / C N, and the case of bivariate sets / C N^. Let us give first the 
more simple definition of 1-sequences, which covers the case of univariate sets, and then extend 
it to fc-sequences for fc > 1, that is, the case of bivariate sets. 

Definition 3. We say that G is a 1-sequence of Givens rotations if G = (G2 ,..., G„_i). 
Notice that in this context, the operations already introduced in Definition specialize in 
the following way: 

• Gv := Gn-i.. ■ G 2 V, for v S C"; 

• G*v := Gl...Gl_^v, for v G C"; 

• G[i : j] := (Gi ,..., Gj), for 2 ^ i < j ^ n — 1, is a slice of G from i to j. 

Here, and hereafter, we use the notation i : j to mean the tuple (i,i + 1,... ,j) for i < j. 
Sometimes we use the expressions G[i :] or G[. j] for G[i '. n — 1] and t/[2 : j], respectively. That 
is, leaving the empty field before and after the symbol is a shortcut for “starting from the 
first rotation” and for “until the last rotation”, respectively. 

Below, we recall a useful pictorial representation, introduced in m and [ini , which effectively 
describes the action of the sequence of Givens rotations. We report the case n = 6, where every 
describes a Givens rotation Gi applied to the pair {i,i + 1) of consecutive rows. 


c 

g = (G2,...,G^)= C 

c 

c 


c 

Gv= [ 

c 

c 


Vi 

V2 

V3 

V4 

V5 

V6 
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The definition of 1-sequences is generalized to the case where / C is a bivariate set of 
indices, and where we consider Givens rotations Gij acting on the pair -|- 1) of consecutive 
rows for any j. In this case, the ordering on which induces orderings in any subset I of 
is defined by 

5%G (* 2 , J 2 ) ji > j 2 or (ji = j 2 and ii ^ 12 ). 

Definition 4. We say that Q = is a k-sequence of Givens rotations if I = 

{(i, j) G I i = 2,..., n — 1, f = 1,..., min(i — 1, fc)} with the order induced by With a 
slight abuse of notation we define the sequence Q[ii : 12 ] := , I' = G | i = 

ii,..., min(i 2 + k,n— 1), j = max(l, i — ^2 + 1), • • ■, min(fc, i — ii + 1), 2 ^ < 12 ^ n — 1} to 

be a slice of Q from i\ to 12 , where the ordering in I' is induced by the ordering valid on the 
parent set. 

A pictorial representation similar to the one given above can be used also in this case. For 
example, for k = 2 and n = 6 we have 

G = (G 3 , 2 , G4.2, G5.2, G2,1, Gap, G 4 , 1 , G5.1) 

that is represented by 


g = 


C 

C 

C 

c 



(4.1) 


Note that for every ^ 12 < * 3 , the slices of G can be factored in the following form: 


G[ii : is] = G[i 2 + 1 : ialGiii ■ 12 ], G*[ii ■ h] = G*[ii ■ *2]0*[*2 + 1 : * 3 ], 


where, for notational simplicity, we set G*[ii ■ * 2 ] = {G[ii ■ * 2 ])*- This property is called slicing 
of rotations. 

Note that the order is one of the orders such that Gv coincides with the multiplication of 
the vector v by the Givens rotations in Q with the order induced by the pictorial representation 

(gg. 

It is worth highlighting that the operation of slicing a fc-sequence is equivalent to removing 
the heads and tails from the sequences itself. For example the slice of G defined by ^[3 : n — 2 ] 
is obtained by taking only the bold rotations in the following picture, where n = 7, which 
correspond to G 4 _ 2 , G 5 _ 2 , Gap, G 4 ^i. 


G = 


c 



c 


c 



c 


With the basic tools introduced so far we can define the concept of Givens Vector represen¬ 
tation. 

Definition 5. A Givens Vector (GV) representation of rank k for a Hermitian quasisepa- 
rable matrix A is a triple {G, W,D) where G is a k-sequence of Givens rotations, W G ([^kx{n-i) 
and D is a diagonal matrix such that 
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• D is the diagonal of A; 

• for every i = 1,..., n — 1 the subdiagonal elements of the i-th column of A are equal to 
the last n — i elements of Q[i + 1 where we define 


0 * 

We, 


if i < n — k, 


0 , 

' {Wei)W -.n — i] 


otherwise 


where Oj is the 0 vector of length j if j > 0, and is the empty vector otherwise. That is, 
tril(A, —l)e, =Q[i + l ■]w,. 

If the triple {Q, W, D) is a GV representation of the matrix A we write that A = GV(^, W, D). 

We refer to nni for a detailed analysis of the properties of this representation. We recall 
here only the following facts: 

• If A is fc-quasiseparable then there exists a fc-sequence Q, a matrix W G cfcx(n-i) 
diagonal matrix D such that A = GV{Q, W, D). 

• If = GV{Q,W,D) for some fc-sequence S, W G and D diagonal, then A is 

j-quasiseparable with j ^ k, i.e., the existence of the representation guarantees that A 
is at most fc-quasiseparable. 

We introduce now an important operation on Givens rotations, called turnover. The follow¬ 
ing Lemma can be thought as a partial answer to the question whether two Givens rotations 
commute. It is clear that if we have Gi and Gj such that \i — j| > 1 then GiGj — GjGi = 0. 
This is also true if the two rotations act on the same rows, but it does not hold when they are 
acting on consecutive rows. In the latter case, the turnover gives a way to swap the order of the 
rotations. 

Lemma 4.1. Let Q be a sequence of Givens rotations and Fi a Givens rotation acting on the 
rows i and i + 1. Then there exists another sequence Q and a Givens rotation acting on the 
rows i — 1 and i such that 


GF, = F,_ig. 

Moreover, G differs from G only in the rotations acting on the rows with indices {i — l,i) and 
(z,z -I- I). 

See [13] for a proof of this fact. The pictorial representation of the Givens rotations can be 
helpful to understand how the turnover works. 

C . . C 

c c 

C ■ c 

c 

The above lemma can be easily extended to fc-sequence of rotations. 

Corollary 2. Let G be a k-sequence of Givens rotations and Fi a Givens rotation acting 
on the rows i and z -I- 1. Then there exists another k-sequence G and a Givens rotation Fi-k 
acting on the rows i — k and i — k -\- I such that 

GF, = F,-kG, 

where Fi-k = I if i—k < 0. Moreover, G differs from G only in the rotations of indices (z—j + 1, j) 
and {i- j,j) for j = l,...,k. 

Again, a pictorial representation of this fact can be useful to figure out the interplay of the 
rotations. Below, we report the case where z > fc -I- 1. 
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c 

[ c 
c c 
c c c 
c c 


c 

[ c c 
c c 
c c 
[ c 


The above operations are very cheap. The cost of the computation of a turnover is 0(1) in 
case of 1-sequences and 0{k) in case of fc-sequences. 

We answer now to the following question: Given a /c-sequence Q and a matrix A of qua- 
siseparable rank k, there exist appropriate matrices W S and D diagonal such that 

A = GY{g,W,D)l 

Lemma 4.2. Let A be a Hermitian matrix and Q a k-sequence of Givens rotations. Then 
B = g*A is lower handed with a bandwidth of k, i.e., b^ j = 0 for i — j > k, if and only if 
the matrix A admits a representation of the form GY{Q,W, D) for some W G and D real 

diagonal. 

Proof We first suppose that A = GV{g,W,D). Recall that, by definition of GV represen¬ 
tation, tril(A, —l)ei = g[i -I- 1 :]wj for i = 1,..., n — 1. This implies that 

g* trii(A,-i)e, = g*g[i +1 :]w, = g*[: i\g*[i +1 ■]g[i +1 ■]w, = g*[. i\w,. 

We also have 


g* iTm(A)ei = g*[. i]g*[i + 1 :] triu(A)ei = g*[. i]triu(A)ei, 

since G* [i -I- 1 :] is acting on rows that are null. So by decomposing A = tril(A, —1) -I- triu(A) we 
have 


g*Aet = g* triu(^)ei -I- g* tril(A, -l)ei = z](triu(A)ei -|- w^). 


Now observe that the rotations inside g*[: i] only act on the first i + k rows. This implies 
that, since both Wj and triu(A)ei have all the components with index strictly bigger than i + k 
equal to zero, the same must hold for g*[: i]{Wi + triu(A)ei), and this completes the proof. The 
converse is also true. In fact, if g*A is lower banded with bandwidth k we can build W by 
setting Wci = {g*[i + 1 :]Aei) [z-I- 1 : z-I-fc] and D equal to the diagonal of A. Then the equation 
A = GY{g, W, D) can be verified by direct inspection. □ 

To simplify the notation when talking about ranks in the lower part of quasiseparable matri¬ 
ces, we say that a fc-sequence g spans U G if there exists Z S such that g*U = [ q ]. 
This definition is motivated by the following 

Lemma 4.3. Ifg spans U G then, for every V G W G and D diagonal, 

the matrix Ai = UV* + GV [g, W, D) is lower k- quasiseparable and A 2 = t{UV*) + GV {g, W, D) 
is k- quasiseparable. In particular, both g*Ai and g*A2 are lower banded with bandwidth k. 

Proof. For the first part of the Lemma it suffices to observe that g*Ai is lower banded with 
bandwidth k. This follows directly by noting that A = GY{g,W, D) + UV*. Since g*UV* = 
V* and g*GY{g, W, D) is lower banded by Lemma 4.2 we conclude that also g*Ai is lower 


[21 
0. 


banded with bandwidth k. Since the strictly lower part of A2 coincides with the one of Ai we 
find that also A 2 is lower fc-quasiseparable. Given that A 2 is Hermitian, we conclude that A 2 is 
also upper fc-quasiseparable. To see that also g* A 2 is lower banded we can write 


g*A2 = g*{Ai - triu(C/V*) + triu(VC/*, 1)) = g*Ai + g*R, 
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where R is upper triangular. Since Q*, represented as a matrix, is the product of k upper 
Hessenberg matrices, it is lower banded with bandwidth k. This implies that also Q*R is lower 
banded with bandwidth k and so the same must hold also for Q*A 2 . □ 

Remark 1. The above Lemma shows how the Givens rotations in a GV representation of a 
matrix in QS'Hk cire sufficient to determine the column span of the suhmatrices contained in the 
lower triangular part. These matrices give the same information obtained by knowing the matrix 
U in the D + tiJJV*) representation. 

We need to find efficient algorithms to perform operations on this class of matrices. More 
precisely, in order to implement in terms of algorithms the constructive proofs given in Section 
we need to explain how to efficiently perform the following tasks assuming we are given GV 
representations oi M = D + t{UV*) = GV(0, W, D) S QS'Hk and of S' = Ds + t{uv*) G QS'Hi, 
where U,V G Q spans G, u,v G C” and u = Ux for some vector x G C^: 

1. Compute a GV representation of rank k of M + S. 

2. Given a unitary upper Hessenberg matrix P, compute a GV representation of rank k of 
t(PU{PV)*), and a GV representation of rank 1 of P = PMP* - t{PU{PV)*). 

We start by analyzing the problem of computing M + S. Since S = Ds + t{Uxv*), in 
view of Lemma |4.3[ we find that Q* S is lower banded with bandwidth k and so 1^ applying 


Lemma |4.2| there exists Ws G and Ds real diagonal such that S = GV(C/, Ws, Ds) is a GV 
representation of S. Given an algorithm for the computation of Ws it is possible to represent 
M + S as GV(5, Ws + W,D + Ds). 

We need to investigate how to actually compute the matrix Ws assuming we are given a 
GV representation S = GV(P, z, Ds) of S. Recall that the i-th column of Ws can be extracted 
from the components of the vector Q*[i + 1 :]Mei, as explained in Lemma 4.2 We can compute 
the whole matrix Ws at cost 0{nk) by following this procedure: 

• Compute the last column of Ws by using Lemma |4.2| This is almost cost-free since no 


rotations are involved, and the only significant element of Wse„_i is equal to z„_i. 
Compute WsCi starting from Wsei+i; this vector can be computed by using some ele¬ 
ments in Q* [z-|-l ■]Mei. In fact, since we are in the 1-quasiseparable case then z^ = ZiCi+i. 
So we have 


g*[i + l ■]Me, = g*[i + l ■]P[i + 1 :]z, = g*[ip l]g*[i -H 2 ■]P[i G- 2 ■]T[i + l]z, 

= z,g* [i + l]g*[i + 2 ■]P[i + 2 :] (ae, + ^e,+i) = 

= zig* [i + 1] ( - g* [i -|- 2 :] ALci+i -I- aCi 

\Zi+l 

Since g*[i + 2 :]Mei+i has already been computed, we obtain WsCi from WsCi+i at the 
cost of 0{k) rotations. Some care needs to be taken because in the above formula it may 
happen that Zj+i is zero. This issue can be solved by replacing Mci with F[i + 1 :]ei 
and by multiplying the computed WsCi by Zi after the computation. This way, we can 
also avoid the possible cancellation in cases where the Zi are unbalanced. 

The above procedure provides a fast method for computing a fc-quasiseparable representation 
of S'. 

We investigate now the problem of computing the residual matrix given by Theorem |2.4| and 
Corollary [2 using the GV representation of M and S. We rephrase the proof of Theorem |2.4| in 
terms of GV representations. 

Analyzing the proof of Theorem |2.4| and its corollaries, we can observe that the algorithm can 
be easily constructed if we are able to compute the residual Ri = t{FiUVF*) — Fit{UV*)F* for 
a Givens rotation Fi acting on the rows {i,i + l). In the following, we suppose that i < n — k — 1 
so that we do not need to care about “border conditions”. However, all the concepts reported 
are easily extendable to those cases by just adding some care in the process. 
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Assume we are given D, U, V, W and G such that M = D + t{UV*) = GY{G, W, D). Observe 
that we can compute an updated G such that G spans FiU. In fact, we know that G*F*FiU is 
of the form [ ^ ] where x is an appropriate k x k block. The rotation F* can be passed through 
the rotations inside G* (by properly updating them using the turnover operation) obtaining 
F*f_^.G* = G*F*. Then, by F*j^^.G*FiU = [ q ], we can conclude that also G*FiU = Fi+k [ o ] = 
[ Q ] since Fi^^ is operating on the null rows. 

Moreover, we can check that GV{G,W,D) correctly represents the lower part of t{Fi{D + 
UV*)F*) on every column but the one with indices i,i + 1. In fact, the diagonal part of M is 
left unchanged on the indices different from + For the rest of the matrix we can distinguish 
two cases and we do not need to care about D: 

• If j > i + 1, both the left multiplication by Fi and the right multiplication by F* leave 
unchanged the relevant part of U and V needed for the computation of the portion of 
the j-th column contained in the lower part of the matrix. Moreover, since in this case 
Q\j F I :] = G[j + I :] we conclude that the proposed representation for these columns is 
valid. 

• Also when j < i the right multiplication by F* does not change the j-th column at 
all. However, the left multiplication by Fi does change the j-th column and we can 
verify that tTi\{t{FiUV*F*), —l)ej = tril{FiUV,—l)ej. Recall that, by definition of GV 
representation, we have 


triI{t{UV*), —I)ej = ivi\{UV*, —I)ej = tril(M, —I)ej = G[j + i 
Since j < i we have Fi tril(C7H*, —I)ej = ti'i\{FiUV*, —I)ej so that we can write 
tril{FiUV*,-I)ej = FiG[j + 1 = G[j + 1 '.jFi+kWj = G[j + 1 


where the last two equalities follow from the definition of G and from the fact that the 
components of Wj with index bigger than j + k are zero. Thus we have that GV {G, W, D) 
provides a good representation of the lower part of the j-th column of t{FiUV*F*), as 
requested. 


A pictorial representation of these two facts can help to get a better understanding of what 
is going on (here we are fixing k = 2). The rotation Fi on the left is highlighted using the bold 
font. Equation (4.2) represents the first case, where the rotation Fi does not intersect the indices 
of the rotations in ^[j -|-1 :], and Equation (4.3) the latter case, where an update of the rotations 
is necessary. 
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(4.2) 
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(4.3) 


This means that we need to track only what happens on columns {i,i + 1). We show how 
to update D and W in the i an i + 1 components in order to account for what happens in these 
indices. Note that these columns of M can be described in the following way (we report the case 
k = 3 for simplicity): 


M 


Ci 


Ci+l 


= g[i + 2 



di 0 


0 X 


r 

wi.j 0 


0 di+i 


L 

r 

W2,i 0 

+ 

0 iciy+i 


L 

r 

W3,Z 0 


0 W2,i+1 


L 

0 0 


_0 W3y+1_ 

1 


Left and right multiplying by Fi and F* (reported with the bold font), respectively, leads to 
the following structure: 


F,MF* 


Ci+l 


— t/[i + 2 :] 


di 

0 

c 

0 

X 


0 

0 

di+1 

W2,i 

0 

+ 

0 

Wl^i+1 

W3,t 

0 


0 

W2,i+1 

0 

0 


0 

W3,i+1 




We can explicitly compute the value inside the brackets and then observe that, since Q[i + 2 :] 
= Q[i + 2 :], we have a representation of the columns of FiMF*. Now we want to hnd a Hermitian 
matrix R of the form R = aCi+ie* + aeielj^i, Wj,ij %,i+i for j = 1,..., fc and di, such that, 
writing with the rotations taken from Q, we have 


c 

[ 




c 


di 

o' 

r 

'0 

X 



di 

O' 


'0 

X 

Wl,i 

0 

L 

0 

di+1 


^ c 

Wl,i 

0 


0 

di+1 

W2,i 

0 

+ 

0 

Wl^+l 


+ R= P 

W2,i 

0 

+ 

0 

W1++1 

W3,i 

0 


0 

W2,i+1 


L 

W3,i 

0 


0 

W2,i+1 

_ 0 

0 


0 

lC3,i+l_ 


C 

_ 0 

0 


0 

W3,i+1. 


c 

(4.4) 


16 



Let C be the left matrix in (4.4|. Then the elements must coincide with the vector 

(7[3 2], the diagonal elements di and di+i are determined by the diagonal of the top 2x2 block 

of C. It remains to determine the elements Wj^i and the value a. To find them we can multiply 
on the left by the inverses of the rotations in Q[i]. We get the equation 


/ 

'o' 



di 


a 



wi.j 

Cei + 

0 


= 

W2,t 


0 




V 

0 



_ 0 _ 


We can choose a such that we get a 0 in the last component (which can always be done if the 
rotations are not trivial) and then set the values Wj^i by back substitution. 

The algorithm presented above allows us to find the residual matrix R and a new k- 
quasiseparable matrix M with the desired rotations in its representations such that M R = 
FiMF*. This is what we desire in order to implement the algorithm. The reader may wonder if 
M is exactly the matrix t{FiUVF*). This is actually true by assuming some hypothesis about 
the irreducibility of the matrices involved and that i is not bigger than n — k — 1, essentially due 
to the fact that F^MF* is forced to be equal to t{FiUV*F*) in all the elements but the 2x2 
submatrix of indices (i,i + 1). We omit the analysis of this property since, in the end, it is not 
very relevant for the algorithm. The important result is just that we have found a matrix with 
the correct Givens rotations such that the sum of the lower parts is contained in the same span. 
This last claim clearly holds in this case. 

5. Reduction algorithm. In this section we explain how the reduction algorithm can be 
constructed by using the tools presented in the previous sections. 

Recall that the matrix A = D + UV* can be represented in the more general form A = 
t{UW*) + S' + UV* , where S is a Hermitian 1-quasiseparable matrix and U,V,W G C"^^, just 
by setting W = Q and S = D. Recall also that, by Theorem |2.4[ this form is maintained by 
the trailing principal submatrices Aj G of the matrices Aj, generated at each 

step j of the algorithm, that is, Aj = t{UjW*) + Sj + UjV* . The matrices Uj and Vj are easily 
obtained by multiplying Uj-i and Vj-i by a sequence of Givens rotations and by removing the 
first row. The matrices Sj and Wj will be used to store the “residues”. 

A high level overview of the algorithm is reported in the pseudo-code of Algorithm 

The functions in the code of Algorithm perform the following operations: 
cleanColumn(u) is a function that takes as input a column vector and returns a sequence of 
Givens rotations Q such that Qv = viCi -\- ae2 for some a. 

(i?M, AT) ^conjugateAndTruncate(M, takes as input a quasiseparable Hermitian matrix 
M G QSRk and a sequence of Givens rotations G. Then it computes a quasisepara¬ 
ble representation for QMQ* — Rm where Rm is a matrix in QSRi. It returns an 
updated representation of M and the residual matrix Rm- 
S -G- Rm + Rs computes the sum of the matrices Rm,Rs S QSHi- This is done by assuming 
that both have the same sequence of Givens rotations in their representation. 
reduceTrailingBlock(A) reduces the last kxk block of the matrix using a standard Hessenberg 
reduction process. This is done because, in the last steps, the trailing block does not 
have any particular structure anymore. 

Some numerical issues might be encountered in the above version of the algorithm. For 
instance, some cancellation may happen in the sum Rm + Rsy which eventually may affect the 
Givens rotations of the representation of M. 
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Algorithm 2 High level reduction process 
1: Aii- D + UV* * 

2: M ^ 0 

3: S^O 

4: s ■<—zeros(l, n — 1) 

5: d ■<— zeros(l, n) 

6: for i = 1, ... ,n — 2 — k do 
7: G cleajiColmnn(Ai[:, 1]) 

8: d[i] 1])[1] 

9 : s[i] ^ {GAi[:, 1])[2] 

10: U ^ {G ■U)[2:n,-.] 

11: V^{G-V)[2:n,:] 

12: {Rm,M) conjugateAndTruncate(M, 0) 

13: {Rs,S) -i— conjugateAndTruncate(S', 0) 

14: M ^ M + S 

15: S i — Rm dig 

16: end for 

17: (d[n — 1 — fc : n], s[n — 1 — k : n — 1],U,V) ^ reduceTrailingBlock(A„_i_fe) 


A technique based on re-orthogonalization can be used to restore better approximations. 
Recall that the rotations inside the GV representation of M are such that G*U = [q]. Such 
rotations are not unique but (at least with some hypothesis on irreducibility) are essentially 
unique, that is, they can be determined up to a multiplicative constant of modulus 1. Based 
on this information we can compute rotations in order to obtain G*U in the desired form and 
correct the moduli of the sine and cosine inside G without altering the signs. 

This has shown to be quite effective in practice, leading to better numerical results. The 
cost of a reorthogonalization is the cost of a QR factorization of U, thus asymptotically 0(nk^). 
By performing it every k steps we have a total cost of the modified reduction algorithm that is 
still 0{n‘^k), since we need 0{n) steps to complete the reduction and 0{nk) ■ ^0{n) = 0{ri^k) 
ops. 


6. Numerical experiments. The algorithm presented in Sectionj^has been implemented 
in the Julia language. It has been run on a Laptop with an Intel(R) Core(TM) i3-2367M CPU 
running at 1.40GHz and 4 GB of RAM. 

In order to analyze the complexity and the accuracy of the results we have performed the 
following tests: 

• We have run the algorithm on matrices of different sizes but with constant quasisepa¬ 
rability rank k = 10. The purpose is to verify that the CPU time is quadratic in the 
dimension of the problem. 

• We have run the algorithm at a fixed dimension n = 200 with values of k between 5 and 
160. Here the goal is to verify that the CPU time grows linearly with the rank. 

• We have run the algorithm on some test problems in order to measure the errors on the 
eigenvalues computed starting from the final Hessenberg form. The purpose of this set 
of tests is to check the numerical stability of the algorithm. 

Every experiment has been run 10 times and the mean value of the timings has been taken. 
In Figure [6T] we have reported, in log scale, the timings for some experiments with n = lOOi for 
i = 1,..., 10. In Figure [6?^ we have reported the CPU time in the case of matrices of fixed size 
n = 400 with various quasiseparable ranks ranging from 5 to 160. 
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Size Time (s) 


10 ^ 


10 “ 



10 ^ 


10 “ 


n (Size) 


100 

0.65 

200 

2.68 

300 

6.06 

400 

11.05 

500 

17.5 

600 

25.8 

700 

35.7 

800 

49.4 

900 

67.17 

1000 

77.56 


Figure 6.1. CPU time, in seconds, for the Hessenberg reduction of a diagonal plus rank 10 matrix of size 
n. Here the line is the plot of'yn^ for an appropriate 7 . It is evident the quadratic behavior of the time. 



QS Rank Time (s) 

■5 Hue 

10 11.42 

20 14.36 

40 20.58 

80 31.02 

160 44.25 


Figure 6.2. CPU time, in seconds, for the Hessenberg reduction of a 400 X 400 diagonal plus rank k matrix. 


Looking at the results in Figure [6?^ we see that the complexity in the rank is almost sublinear 
at the start. This is due to the inefficiency of operations on small matrices and the overhead of 
these operations in our Julia implementation. The linear trend starts to appear for larger ranks. 

As a last experiment in Figure [ffi3| and Figure 6.4 we have reported the absolute and relative 
errors, respectively, on eigenvalue computations for various sizes and fixed quasiseparable rank. 
The errors were obtained as differences between the eigenvalues computed from the starting full 
matrix using the QR algorithm and the QR algorithm applied to the Hessenberg matrix provided 
by our algorithm. 

The matrices in these examples have been obtained by using the randn function that con¬ 
structs matrices whose elements are drawn from a N{0, 1) Gaussian distribution. This function 
has been used to construct D, U and V diagonal and nxk, respectively, such that A = D + UV*. 


7. An application. Let P{x) = J2i=o ^ matrix polynomial where Pi are k x k 

matrices. In [2] a companion linearization A for P{x) has been introduced where A is an n x n 
matrix, n = dk, ot the form O) with n = dk. The computation of the eigenvalues of P{x), 
that is, the solutions of the equation detP(a;) = 0, is therefore reduced to solving the linear 
eigenvalue problem for the matrix A. The availability of the Hessenberg form H ol A with 
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n (Size) 


Size 

Mean 

Minimum 

Maximum 

40.0 

5.52e-14 

l.lle-15 

3.97e-13 

80.0 

2.59e-13 

0.0 

2.43e-12 

160.0 

5.23e-13 

5.32e-15 

3.74e-12 

320.0 

5.06e-12 

1.49e-14 

2.41e-10 

640.0 

1.80e-10 

1.42e-13 

2.43e-9 

1280.0 

8.43e-9 

6.43e-15 

3.32-7 


Figure 6.3. Absolute errors on eigenvalues computation for random matrices of quasiseparable rank 30 and 


variable sizes. 



Size 

Mean 

Minimum 

Maximum 

40.0 

9.13e-15 

2.86e-16 

1.42e-13 

80.0 

2.55e-13 

0.0 

3.22e-12 

160.0 

1.83e-12 

1.92e-16 

1.07e-10 

320.0 

9.66e-12 

4.91e-16 

4.49e-10 

640.0 

7.79e-10 

1.47e-15 

5.34e-8 

1280.0 

5.15e-8 

2.69e-15 

6.35e-6 


Figure 6.4. Relative errors on eigenvalues computation for random matrices of quasiseparable rank 30 and 
variable sizes. 


the quasiseparability structure, enables one to apply the QR iteration to i/ at a low cost by 
exploiting the both the Hessenberg and the quasiseparable structures. 

A different approach to solve the equation detP(a;) = 0, followed in [T], consists in applying 
the Ehrlich-Aberth iteration to the polynomial detP(x), or alternatively, to represent the poly¬ 
nomial P{x) in secular form [3]. In this case, one has to compute detP(x) at different values 
Indeed, the evaluation of detP(a:) at a single value x = ^ can be performed by 
applying the Horner rule in order to compute the matrix P(U and then by applying Gaussian 
elimination for computing the determinant det P(U- The overall cost is 0{dk^ + k^) so that the 
computation aX n = dk different points has the cost 0{d^k^ + dk^). 

The availability of the structured Hessenberg form H allows us to reduce this cost. We 
will show that computing det(a;/ — A) = det(a;/ — H) can be performed in 0{nk) = 0{dk'^) 
operations. Asymptotically, the value of this cost is the minimum that we can obtain. In fact, 
the matrix A is defined by 2nk + n entries, so that any algorithm which takes in input these 
2nk + n values must perform at least nk + n/2 operations. In fact each pair of input data must 
be involved in at least an operation. In the case where we have to compute det{x/ — A) at dk 
different values we obtain the cost 0{d^k^) which improves the cost required by the Horner rule 
applied to the matrix polynomial. 

The algorithm for computing det(a:/ — H) at a low cost relies on the Hyman method [13] . 
Assume that H is in /c-quasiseparable Hessenberg form. Consider the system {xl — H)v = 
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aei, set = 1 so that the equations from the second to the last one form a triangular k- 
quasiseparable system. The hrst equation provides the value of a after that ui,... ,u„_i have 
been computed. Since a triangular quasiseparable system can be solved with 0(nk) ops, we 
are able to compute a at the same cost. On the other hand, by the Cramer rule we find that 
^ = Vn = Ci{Y\i =2 n / det{xl—H) which provides the sought value of det{xI—H). A similar 

approach can be used for computing the Newton correction p(x)/p' {x) for p{x) = det(a;/ — H). 
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